12 research outputs found

    Group Membership Management Framework for Decentralized Collaborative Systems

    Scientific and commercial endeavors can benefit from cross-organizational, decentralized collaboration, which is becoming key to innovation. This work addresses one of its challenges: efficient access control to assets for distributed data processing among autonomous data centers. We propose a group membership management framework dedicated to realizing access control in decentralized environments. Its novelty lies in the synergy of two concepts: a decentralized knowledge base and an incremental indexing scheme, both assuming a P2P architecture in which each peer retains autonomy and full control over the choice of peers it cooperates with. The information exchanged is reduced to the minimum required for user collaboration, assuming limited trust between peers. The indexing scheme is optimized for read-intensive scenarios by offering fast queries -- look-ups in precomputed indices. Index precomputation increases the complexity of update operations, but their performance is arguably sufficient for large organizations, as our tests show. We believe that our framework is a major contribution towards decentralized, cross-organizational collaboration.
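The read-optimized indexing idea can be illustrated with a small sketch. This is hypothetical code, not the paper's implementation: membership checks are single look-ups in a precomputed transitive closure of the group graph, while every update pays the cost of rebuilding the affected index entries.

```python
# Hypothetical sketch of a read-optimized group-membership index:
# is_member() is an O(1) look-up in a precomputed index, while
# add_membership() triggers the expensive recomputation.
from collections import defaultdict

class MembershipIndex:
    def __init__(self):
        self.direct = defaultdict(set)  # member -> groups it directly belongs to
        self.index = defaultdict(set)   # member -> all groups, transitively

    def add_membership(self, member, group):
        self.direct[member].add(group)
        self._recompute()               # costly write, cheap reads

    def _recompute(self):
        # Rebuild the full transitive closure; a real incremental scheme
        # would touch only the entries affected by the update.
        self.index = defaultdict(set)
        for member in self.direct:
            stack, seen = list(self.direct[member]), set()
            while stack:
                g = stack.pop()
                if g not in seen:
                    seen.add(g)
                    stack.extend(self.direct.get(g, ()))
            self.index[member] = seen

    def is_member(self, member, group):
        return group in self.index[member]  # fast precomputed look-up

idx = MembershipIndex()
idx.add_membership("alice", "devs")   # hypothetical user and groups
idx.add_membership("devs", "org-a")   # nested group membership
```

After the two updates, `idx.is_member("alice", "org-a")` holds via the nested group, without any traversal at query time.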

    INDIGO-DataCloud: A data and computing platform to facilitate seamless access to e-infrastructures

    This paper describes the achievements of the H2020 project INDIGO-DataCloud. The project has provided e-infrastructures with tools, applications, and cloud framework enhancements to manage the demanding requirements of scientific communities, either locally or through enhanced interfaces. The middleware developed makes it possible to federate hybrid resources and to easily write, port, and run scientific applications in the cloud. In particular, we have extended existing PaaS (Platform as a Service) solutions, allowing public and private e-infrastructures, including those provided by EGI, EUDAT, and Helix Nebula, to integrate their existing services and make them available through AAI services compliant with GEANT interfederation policies, thus guaranteeing transparency and trust in the provisioning of such services. Our middleware facilitates the execution of applications using containers on Cloud- and Grid-based infrastructures, as well as on HPC clusters. Our developments are freely downloadable as open-source components and are already being integrated into many scientific applications.

    CS3 2022 - Cloud Storage Synchronization and Sharing

    Onedata [1] is a distributed, global, high-performance data management system that provides transparent and unified access to globally distributed storage resources and supports a wide range of use cases, from personal data management to data-intensive scientific computations. Thanks to its fully distributed architecture, Onedata allows for the creation of complex hybrid-cloud infrastructure deployments combining private and commercial cloud resources. It allows users to collaborate, share, and publish data, as well as perform high-performance computations on distributed data, using applications relying on POSIX-compliant data access. Onedata comprises the following services: Onezone, the authorisation and distributed metadata management component that provides access to the Onedata ecosystem; Oneprovider, which delivers data to users and exposes storage systems to Onedata; and Oneclient, which provides transparent, POSIX-compatible data access on user nodes. Oneprovider instances can be deployed, as a single node or an HPC cluster, on top of high-performance parallel storage solutions with the ability to serve petabytes of data with GB/s throughput. Recently, Onedata was enhanced with a powerful workflow execution engine, powered by OpenFaaS [2]. It allows for the creation of complex data processing pipelines that can leverage the transparent access to distributed data provisioned by Onedata. In particular, the workflow functionality can be used to create a comprehensive, OAIS-compliant [3] data archiving and preservation system covering all archival requirements, including ingestion, validation, curation, storage, and publication. The workflow function library contains ready-to-use functionalities (implemented as Docker images) covering typical archiving actions such as metadata extraction, format conversion, checksum validation, virus checks, and others.
    New custom functions can be easily added and shared among user groups. The solution was thoroughly tested running on auto-scalable Kubernetes clusters. Currently, Onedata is used in the European EGI-ACE [4], PRACE-6IP [5], and FINDR [6] projects, where it provides a data-transparency layer for computation and data-processing automation deployed in dynamically provisioned, containerised hybrid-cloud environments. REFERENCES: [1] Onedata project website. https://onedata.org. [2] OpenFaaS - Serverless Functions Made Simple. https://www.openfaas.com/. [3] David Giaretta, CCSDS Group, and CCSDS Panel. Reference model for an Open Archival Information System (OAIS). 06 2012. [4] EGI-ACE: Advanced Computing for EOSC. https://www.egi.eu/projects/egi-ace/. [5] Partnership for Advanced Computing in Europe - Sixth Implementation Phase. http://www.prace-ri.eu. [6] FINDR: Fast and Intuitive Data Retrieval for Earth Observation
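The chaining of archiving steps can be sketched as a toy pipeline. This is illustrative code only, not the Onedata/OpenFaaS API: each step plays the role of a packaged function, and the runner chains them the way the workflow engine chains function invocations. The step names mirror the archiving actions named in the abstract.

```python
# Toy archiving pipeline (hypothetical, not Onedata code): each step is a
# stand-in for a serverless function packaged as a Docker image, and
# run_pipeline() chains them like a workflow engine.
import hashlib

def checksum_step(item):
    # Checksum validation: record the SHA-256 of the payload.
    item["sha256"] = hashlib.sha256(item["data"]).hexdigest()
    return item

def metadata_step(item):
    # Metadata extraction: record a simple size attribute.
    item["size"] = len(item["data"])
    return item

def run_pipeline(item, steps):
    # Each step receives the item produced by the previous one,
    # mimicking chained function invocations in a workflow.
    for step in steps:
        item = step(item)
    return item

archive = run_pipeline({"name": "file.dat", "data": b"payload"},
                       [checksum_step, metadata_step])
```

In the real system a step would be an independently deployed container; the point here is only the pipeline shape: a sequence of small, composable functions over an archival item.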

    CS3 2021 - Cloud Storage Synchronization and Sharing

    Onedata [1] is a global, high-performance, transparent data management system that unifies data access across globally distributed infrastructures and multiple types of underlying storage, such as NFS, Amazon S3, Ceph, OpenStack Swift, WebDAV, XRootD, HTTP and HTTPS servers, and other POSIX-compliant file systems. Onedata allows users to collaborate, share, and perform computations on data using applications relying on POSIX-compliant data access. Thanks to a fully distributed architecture, Onedata allows for the creation of complex hybrid-cloud infrastructure deployments, including private and commercial cloud resources. Onedata comprises the following services: Onezone, the authorisation and distributed metadata management component that provides access to the Onedata ecosystem; Oneprovider, which delivers data to users and exposes storage systems to Onedata; and Oneclient, which provides transparent, POSIX-compatible data access on user nodes. Oneprovider instances can be deployed, as a single node or an HPC cluster, on top of high-performance parallel storage solutions with the ability to serve petabytes of data with GB/s throughput. Onedata introduces the concept of a Space: a virtual volume, owned by one or more users, where they can organize their data under a global namespace. Spaces are accessible to users via a web interface, which allows for Dropbox-like file management; a FUSE-based client that can be mounted as a virtual POSIX file system; a Python library (OnedataFS [2]); or REST and CDMI standardized APIs. As a distributed system, Onedata can take advantage of modern scalable solutions such as Kubernetes, and thanks to its rich set of REST APIs and the OnedataFS library it can process data and metadata alike at scale using FaaS systems such as OpenFaaS.
    Currently, Onedata is used in the European Open Science Cloud Hub [3], PRACE-5IP [4], EOSC Synergy [5], and Archiver [6] projects, where it provides a data-transparency layer for computations deployed on hybrid clouds. Acknowledgements: This work was supported in part by 2018-2020's research funds in the scope of the co-financed international projects framework (project no. 3905/H2020/2018/2 and project no. 3933/H2020/2018/2). [1] Onedata project website. http://onedata.org. [2] OnedataFS - PyFilesystem Interface to Onedata Virtual File System. https://github.com/onedata/fs-onedatafs. [3] European Open Science Cloud Hub (bringing together multiple service providers to create a single contact point for European researchers and innovators). https://www.eosc-hub.eu. [4] Partnership for Advanced Computing in Europe - Fifth Implementation Phase. http://www.prace-ri.eu. [5] EOSC Synergy: European Open Science Cloud - Expanding Capacities by building Capabilities. https://www.eosc-synergy.eu. [6] Archiver - Archiving and Preservation for Research Environments. https://www.archiver-project.eu
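The Space abstraction can be illustrated with a stdlib-only toy. The classes and names below are hypothetical, not the Onedata API: a Space is a virtual volume addressed under a single global namespace and backed by storage exposed through one or more providers.

```python
# Toy model (not Onedata code) of Spaces under a global namespace:
# paths like /<space>/<file> resolve to per-space storage, and each
# space records the providers that back it.
class Space:
    def __init__(self, name, providers):
        self.name = name
        self.providers = providers  # providers exposing storage for this space
        self.files = {}             # path -> bytes; stand-in for real storage

class GlobalNamespace:
    def __init__(self):
        self.spaces = {}

    def create_space(self, name, providers):
        self.spaces[name] = Space(name, providers)

    def _resolve(self, path):
        # Split "/<space>/<relative path>" into its space and file parts.
        space, _, rel = path.lstrip("/").partition("/")
        return self.spaces[space], rel

    def write(self, path, data):
        space, rel = self._resolve(path)
        space.files[rel] = data

    def read(self, path):
        space, rel = self._resolve(path)
        return space.files[rel]

ns = GlobalNamespace()
ns.create_space("experiment-a", ["provider-eu", "provider-us"])  # hypothetical names
ns.write("/experiment-a/raw/run1.dat", b"\x00\x01")
```

In Onedata itself this resolution is what Onezone's distributed metadata layer provides, while the actual bytes live on storage behind Oneprovider instances; the toy only mirrors the addressing scheme.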

    Interlingual Live Subtitling for Access (ILSA)

    Live subtitles produced by respeaking are mainly intralingual, which means that there is an urgent need to train professionals who can produce interlingual live subtitles (ILS) with this technique, thus providing access to live content not only for deaf people but also for foreign audiences, including migrants and refugees. The aim of ILSA is to design (IO3 and IO4), develop (IO5), test (IO6) and recognise (IO7) the first training course for ILS and to produce a protocol for the implementation of this service in three real-life scenarios: TV, the classroom and the Parliament (IO7). The curriculum and training materials will be flexible so that they can be integrated into different learning environments, not only for HE translation students but also for professionals already working in translation and accessibility. The ILSA consortium includes four HEIs (UVigo, UAntwerp, UWarsaw and UVienna) and three non-academic partners (the Galician Parliament, VRT and Dostepni.eu). The team at UVigo, leader of the project, has been working on intralingual respeaking for the past ten years. It is responsible for the only monograph on the subject and a quality assessment model that is used in over 30 countries worldwide. UVigo will work closely with the Galician Parliament to test the use of ILS in a pioneering service that can make the Parliamentary sessions accessible in Galician, thus promoting a regional language spoken by almost 3 million people. UAntwerp was the first university in the world to set up a training course in intralingual respeaking. It has also been involved in many research projects with the Belgian public broadcaster VRT, one of the few broadcasters to have tested ILS. UAntwerp and VRT will collaborate in ILSA to develop and test the new training programme and its implementation for the provision of ILS on TV.
    The team at UWarsaw has been involved in pioneering experimental research on intra- and interlingual respeaking and will be working closely with Dostepni.eu, the first company to produce intralingual live subtitles for TV and social events in Poland. They will test the training and provision of ILS in conferences and in the classroom as a means of access to education for deaf and foreign people. UVienna, one of the world-leading research and training institutions in interpreting, will provide the necessary expertise regarding the simultaneous interpreting skills required for the new profile. ILSA is also supported by 25 associated partners from five continents, thus ensuring the involvement of virtually every leading stakeholder in the field and the widest possible reach of ILSA's impact. The dissemination of the results will also be facilitated by three key actions: the production of a short film illustrating the ILSA training programme, the collaboration with the accessibility-focused radio station Fred Film Radio (which will reach 6.7 million people a year through 25 European language channels), and the inclusion of ILSA in the EU-funded MAP, the first online platform on media accessibility, which will reach the key stakeholders worldwide. This is a critical moment for media accessibility. Given the growing demand for access to live content in a foreign language, ILS will be produced sooner or later. What is at stake here is the quality of the product. Only through a research-informed, comprehensive training programme such as the one proposed here by ILSA will it be possible to ensure that this new service meets the required standards regarding the product and the working conditions of the professionals involved. This is an essential step to guarantee truly wider access that can include and integrate both deaf and foreign audiences in the audiovisual, educational, political and social life of the countries in which they are living.

    The eXtreme-DataCloud project solutions for data management services in distributed e-infrastructures

    The eXtreme DataCloud (XDC) project aims at developing data management services capable of coping with very large data resources, allowing future e-infrastructures to address the needs of the next generation of extreme-scale scientific experiments. Started in November 2017, XDC combines the expertise of 8 large European research organisations. The project develops scalable technologies for federating storage resources and managing data in highly distributed computing environments. The project is use-case driven, with a multidisciplinary approach addressing requirements from research communities in a wide range of scientific domains: Life Science, Biodiversity, Clinical Research, Astrophysics, High Energy Physics and Photon Science, which together are representative of data management needs in Europe and worldwide. The use cases proposed by the different user communities are addressed by integrating different data management services ready to manage an increasing volume of data. Several scalability and performance tests have been defined to show that the XDC services can be harmonized in different contexts and complex frameworks such as the European Open Science Cloud. The use cases have been used to measure the success of the project and to prove that the developments fulfil the defined needs and satisfy the final users. The present contribution describes the results of adopting the XDC solutions and provides a complete overview of the project's achievements.
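One job of a storage-federation layer is deciding which replica of a dataset to serve from. As a purely hypothetical illustration (not an actual XDC component), the sketch below picks a replica among federated endpoints using a simple lowest-latency policy.

```python
# Hypothetical replica selection across federated storage sites
# (illustrative only, not XDC code): prefer the available site with
# the lowest reported latency.
def pick_replica(replicas):
    # replicas: list of (site, latency_ms); latency None means unavailable.
    available = [r for r in replicas if r[1] is not None]
    if not available:
        raise LookupError("no available replica")
    return min(available, key=lambda r: r[1])[0]

# Hypothetical site names and measurements:
site = pick_replica([("site-a", 42.0), ("site-b", 18.5), ("site-c", None)])
```

A production federation layer would fold in policy constraints (quotas, data locality, QoS classes) rather than latency alone; the sketch shows only the selection step.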
